Using next-generation DNA sequence data for genetic association tests based on allele counts with and without consideration of zero inflation
Authors
Abstract
The relationship between genetic variability and individual phenotypes is usually investigated by testing for association using called genotypes. Allele counts obtained from next-generation sequence data can also be used for this purpose. Genetic association can be examined by treating alternative allele counts (AACs) as the response variable in negative binomial regression. AACs from sequence data often contain an excess of zeros, which motivates the use of hurdle and zero-inflated models. Here we examine rough type I error rates and the ability to pick out variants with small probability values for 7 different testing approaches that incorporate AACs as an explanatory or as a response variable. Model comparisons relied on chromosome 3 DNA sequence data from 407 Hispanic participants in the Type 2 Diabetes Genetic Exploration by Next-generation sequencing in Ethnic Samples (T2D-GENES) project 1 with complete information on diastolic blood pressure and related medication. Our results suggest that when the relationship between AACs as the response variable and individual phenotypes as explanatory variables is investigated, hurdle-negative binomial regression has some advantages. This model showed a good ability to discriminate strongly associated variants and controlled overall type I error rates. However, probability values from hurdle-negative binomial regression were not obtained for approximately 25% of the investigated variants because of convergence problems, and the mass of the probability value distribution was concentrated around 1.
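The difference between the zero-inflated and hurdle treatments of excess zeros mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the NB2 parametrization (mean `mu`, dispersion `alpha`), and the zero-probability parameters `pi_zero` and `p_zero` are assumptions chosen for illustration only.

```python
import math

def nb_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability of count k, with mean mu and dispersion alpha (assumed parametrization)."""
    r = 1.0 / alpha                      # "size" parameter
    p = r / (r + mu)                     # success probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def zinb_pmf(k, mu, alpha, pi_zero):
    """Zero-inflated NB: a structural zero with probability pi_zero, otherwise an ordinary NB draw."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, alpha)
    # Zeros can come from either the structural part or the NB part.
    return pi_zero + base if k == 0 else base

def hurdle_nb_pmf(k, mu, alpha, p_zero):
    """Hurdle NB: every zero comes from a binary part; positive counts follow a zero-truncated NB."""
    if k == 0:
        return p_zero
    # Renormalize the NB over k >= 1 so the positive part sums to (1 - p_zero).
    return (1.0 - p_zero) * nb_pmf(k, mu, alpha) / (1.0 - nb_pmf(0, mu, alpha))
```

In the zero-inflated form an observed zero is a mixture of two sources, whereas in the hurdle form the zero probability is modeled on its own and the count part only describes positive AACs. In a regression setting, `mu` and the zero probabilities would each depend on covariates such as the phenotype.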
Similar resources
Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points
Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. A common type of sequence variation is the single nucleotide polymorphism (SNP, pronounced snip). SNPs are the third-generation molecular marker. A SNP represents a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...
Strategies and Clinical Applications of Next Generation Sequencing
DNA sequencing is one of the most valuable techniques in molecular biology and can be used to determine the sequence of nucleotides in a DNA fragment. High-throughput sequencing, known as Next Generation Sequencing (NGS), revolutionized genomic research and molecular biology; the whole human genome can now be sequenced at low cost in several days. NGS technology is simi...
Use of Microsatellite Polymorphisms in Ovar-DRB1 Gene for Identifying Genetic Resistance in Fat-Tailed Ghezel Sheep to Gastrointestinal Nematodes
This study was designed to identify animals genetically resistant to gastrointestinal nematode (GIN) infections using microsatellite polymorphisms of the Ovar-DRB1 gene in lambs of the Iranian Ghezel sheep breed. In the present study, 120 male Ghezel lambs aged 4 to 6 months were randomly selected from six different sheep flocks in East Azerbaijan province (n=20 per flock). These lambs were naturally ...
Simple Sequence Repeats Amplification: a Tool to Survey the Genetic Background of Olive Oils
A reliable DNA extraction method for use on extra virgin olive oil based on a commercial kit was defined, and the possibility of using this DNA for fingerprinting the original cultivar was demonstrated. The genetic traceability of single-cultivar virgin olive oil from two cultivars (Carolea and Frantoio) was achieved by identifying the varieties from which they were produced. This involved the ...
Journal title:
Volume 10, Issue
Pages -
Publication date: 2016